Crossmodal and incremental perception of audiovisual cues to emotional speech.

Authors

  • Pashiera Barkhuysen
  • Emiel Krahmer
  • Marc Swerts
Abstract

In this article we report on two experiments on the perception of audiovisual cues to emotional speech. The article addresses two questions: (1) how do visual cues to emotion from a speaker's face relate to auditory cues, and (2) what is the recognition speed for various facial cues to emotion? Both experiments reported below are based on tests with video clips of emotional utterances collected via a variant of the well-known Velten method. More specifically, we recorded speakers who displayed positive or negative emotions that were either congruent or incongruent with the (emotional) lexical content of the uttered sentence. The first experiment is a perception experiment in which Czech participants, who do not speak Dutch, rate the perceived emotional state of Dutch speakers in a bimodal (audiovisual) or a unimodal (audio-only or vision-only) condition. It was found that incongruent emotional speech leads to significantly more extreme perceived emotion scores than congruent emotional speech, and that the difference between congruent and incongruent emotional speech is larger for the negative than for the positive conditions. Interestingly, the largest overall differences between congruent and incongruent emotions were found in the audio-only condition, which suggests that posing an incongruent emotion has a particularly strong effect on the spoken realization of emotions. The second experiment uses a gating paradigm to test the recognition speed for various emotional expressions from a speaker's face. In this experiment participants were presented with the same clips as in Experiment 1, but this time vision-only. The clips were shown in successive segments (gates) of increasing duration. Results show that participants are surprisingly accurate in recognizing the various emotions, as they already reach high recognition scores in the first gate (after only 160 ms). Interestingly, recognition scores rise faster for the positive than for the negative conditions. Finally, the gating results suggest that incongruent emotions are perceived as more intense than congruent ones, as the former receive more extreme recognition scores than the latter, even after only a short period of exposure.
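The gating procedure described in the abstract can be made concrete with a small stimulus-preparation sketch. The code below is purely illustrative and is not the authors' actual pipeline: only the 160 ms first gate comes from the abstract, while the fixed 160 ms increment, the use of ffmpeg for cutting, and all names (make_gates, the gates/ output directory, the example file name) are assumptions made for the example.

```python
# Illustrative sketch of gating-stimulus preparation (not the authors' actual pipeline).
# Each "gate" is a cumulative segment of the clip: gate 1 = first 160 ms,
# gate 2 = first 320 ms, and so on, up to roughly the full clip duration.
# The 160 ms first gate is reported in the abstract; the fixed step size is assumed.

import subprocess
from pathlib import Path

GATE_STEP_S = 0.160  # assumed constant gate increment, in seconds


def make_gates(clip_path: str, clip_duration_s: float, out_dir: str = "gates") -> list[Path]:
    """Cut `clip_path` into cumulative gates of increasing duration using ffmpeg."""
    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    gates: list[Path] = []
    n_gates = int(clip_duration_s // GATE_STEP_S)
    for i in range(1, n_gates + 1):
        end_s = i * GATE_STEP_S
        gate_file = out / f"{Path(clip_path).stem}_gate{i:02d}.mp4"
        # Re-encode (no "-c copy") so the cut is frame-accurate rather than
        # snapped to the nearest keyframe; "-an" drops the audio track,
        # matching the vision-only presentation of Experiment 2.
        subprocess.run(
            ["ffmpeg", "-y", "-i", clip_path, "-t", f"{end_s:.3f}", "-an", str(gate_file)],
            check=True,
        )
        gates.append(gate_file)
    return gates


# Hypothetical usage: gates = make_gates("speaker01_negative.mp4", clip_duration_s=2.4)
```

In a gating experiment the successive gates would then be shown to participants in order of increasing duration, with a recognition judgment collected after each gate.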

Similar articles

On the role of crossmodal prediction in audiovisual emotion perception

Humans rely on multiple sensory modalities to determine the emotional state of others. In fact, such multisensory perception may be one of the mechanisms explaining the ease and efficiency by which others' emotions are recognized. But how and when exactly do the different modalities interact? One aspect in multisensory perception that has received increasing interest in recent years is the conc...

Crossmodal Integration and McGurk-Effect in Synthetic Audiovisual Speech

This paper presents the results of a study investigating crossmodal processing of audiovisually synthesised speech stimuli. The perception of facial gestures has a great influence on the interpretation of a speech signal. Not only paralinguistic information about the speaker's emotional state or motivation can be obtained. Especially if the acoustic signal is unclear, e.g. because of background noi...

Seeing and Hearing Speech, Sounds, and Signs: Functional Magnetic Resonance Imaging Studies on Fluent and Dyslexic Readers

(The excerpt is a table of contents rather than an abstract: Introduction; Review of the literature; Auditory speech perception; ...)

Audiovisual speech integration: modulatory factors and the link to sound symbolism

In this talk, I will review some of the latest findings from the burgeoning literature on the audiovisual integration of speech stimuli. I will focus on those factors that have been demonstrated to influence this form of multisensory integration (such as temporal coincidence, speaker/gender matching, and attention; Vatakis & Spence, 2007, 2010). I will also look at a few of the oth...

Reduced efficiency of audiovisual integration for nonnative speech.

The role of visual cues in native listeners' perception of speech produced by nonnative speakers has not been extensively studied. Native perception of English sentences produced by native English and Korean speakers in audio-only and audiovisual conditions was examined. Korean speakers were rated as more accented in audiovisual than in the audio-only condition. Visual cues enhanced word intell...

Journal:
  • Language and Speech

Volume: 53, Pt 1

Pages: -

Publication date: 2010